Exploiting structural similarity for effective Web information extraction
Similar resources
Exploiting Structural Similarity For Effective Web Information Extraction
In this paper we propose an architecture that exploits the structural information of web pages to extract relevant information from them. In this architecture, a primary role is played by a distance-based classification methodology. This methodology is based on an efficient and effective technique for detecting structural similarities among semistructured documents, which significan...
Exploiting ASP for Semantic Information Extraction
The paper describes HıLεX, a new ASP-based system for extracting information from unstructured documents. Unlike previous systems, which are mainly syntactic, HıLεX combines semantic and syntactic knowledge for powerful information extraction. In particular, the exploitation of background knowledge, stored in a domain ontology, makes it possible to significantly empower the information extra...
Personalized Web Services for Web Information Extraction
The field of information extraction from the Web emerged with the growth of the Web and the multiplication of online data sources. This paper analyzes information extraction methods. It presents a service-oriented approach to web information extraction that considers both web data management and extraction services. We then propose an SOA-based architecture to enhance flexibility and on-...
Web Information Extraction Systems for Web Semantization
In this paper we present a survey of web information extraction systems and semantic annotation platforms. The survey concentrates on the problem of employing these tools in the process of web semantization. We compare the approaches with our own solutions and propose some future directions for developing the web semantization idea.
Measuring Structural Similarity Among Web
When we describe a Web page informally, we often use phrases like "it looks like a newspaper site", "there are several unordered lists" or "it's just a collection of links". Unfortunately, no Web search or classification tools provide the capability to retrieve information using such informal descriptions that are based on the appearance, i.e., structure, of the Web page. In this paper, we take ...
Journal
Journal title: Data & Knowledge Engineering
Year: 2007
ISSN: 0169-023X
DOI: 10.1016/j.datak.2006.01.001